AITopics | training rate

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding

Shuhan Tan

Neural Information Processing Systems | Oct-8-2025, 20:29:47 GMT

We observe that Uniform on average achieves better results.

artificial intelligence, machine learning, video, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Texas > Travis County > Austin (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.41)

Transfer Learning Under High-Dimensional Graph Convolutional Regression Model for Node Classification

Chen, Jiachen, Huang, Danyang, Wang, Liyuan, Lunetta, Kathryn L., Mukherjee, Debarghya, Cheng, Huimin

arXiv.org Machine Learning | May-26-2024

Node classification is a fundamental task, but obtaining node classification labels can be challenging and expensive in many real-world scenarios. Transfer learning has emerged as a promising solution to address this challenge by leveraging knowledge from source domains to enhance learning in a target domain. Existing transfer learning methods for node classification primarily focus on integrating Graph Convolutional Networks (GCNs) with various transfer learning techniques. While these approaches have shown promising results, they often suffer from a lack of theoretical guarantees, restrictive conditions, and high sensitivity to hyperparameter choices. To overcome these limitations, we propose a Graph Convolutional Multinomial Logistic Regression (GCR) model and a transfer learning method based on the GCR model, called Trans-GCR. We provide theoretical guarantees for the estimate obtained under the GCR model in high-dimensional settings. Moreover, Trans-GCR demonstrates superior empirical performance, has a low computational cost, and requires fewer hyperparameters than existing methods.

matrix, probability, training rate, (16 more...)

arXiv.org Machine Learning

arXiv:2405.16672

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.84)

Siamese Residual Neural Network for Musical Shape Evaluation in Piano Performance Assessment

Li, Xiaoquan, Weiss, Stephan, Yan, Yijun, Li, Yinhe, Ren, Jinchang, Soraghan, John, Gong, Ming

arXiv.org Artificial Intelligence | Jan-4-2024

Understanding and identifying musical shape plays an important role in music education and performance assessment. To simplify the otherwise time- and cost-intensive musical shape evaluation, in this paper we explore how artificial intelligence (AI) driven models can be applied. Considering musical shape evaluation as a classification problem, a lightweight Siamese residual neural network (S-ResNN) is proposed to automatically identify musical shapes. To assess the proposed approach in the context of piano musical shape evaluation, we have generated a new dataset containing 4116 music pieces derived from 147 piano preparatory exercises and performed in 28 categories of musical shapes. The experimental results show that the S-ResNN significantly outperforms a number of benchmark methods in terms of precision, recall and F1 score.

music piece, musical shape, musical shape evaluation, (11 more...)

arXiv.org Artificial Intelligence

arXiv:2401.02566

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Media > Music (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Opening the Black Box: Towards inherently interpretable energy data imputation models using building physics insight

Liguori, Antonio, Quintana, Matias, Fu, Chun, Miller, Clayton, Frisch, Jérôme, van Treeck, Christoph

arXiv.org Machine Learning | Nov-28-2023

Missing data are frequently observed by practitioners and researchers in the building energy modeling community. In this regard, advanced data-driven solutions, such as Deep Learning methods, are typically required to reflect the non-linear behavior of these anomalies. As an ongoing research question related to Deep Learning, a model's applicability to limited data settings can be explored by introducing prior knowledge in the network. This same strategy can also lead to more interpretable predictions, hence facilitating the field application of the approach. For that purpose, the aim of this paper is to propose the use of Physics-informed Denoising Autoencoders (PI-DAE) for missing data imputation in commercial buildings. In particular, the presented method adds physics-inspired soft constraints to the loss function of a Denoising Autoencoder (DAE). In order to quantify the benefits of the physical component, an ablation study between different DAE configurations is conducted. First, three univariate DAEs are optimized separately on indoor air temperature, heating, and cooling data. Then, two multivariate DAEs are derived from the previous configurations. Finally, a building thermal balance equation is coupled to the last multivariate configuration to obtain PI-DAE. Additionally, two commonly used benchmarks are employed to support the findings. We show that introducing physical knowledge in a multivariate Denoising Autoencoder can enhance the inherent model interpretability through the optimized physics-based coefficients. While no significant improvement is observed in terms of reconstruction error with the proposed PI-DAE, its enhanced robustness to varying rates of missing data and the valuable insights derived from the physics-based coefficients create opportunities for wider applications within building systems and the built environment.

air temperature, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

arXiv:2311.16632

Country:

North America > United States > California (0.14)
Europe > Germany > North Rhine-Westphalia (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.88)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas (0.68)
Energy > Renewable (0.68)
Construction & Engineering > HVAC (0.49)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mitigating Negative Transfer in Multi-Task Learning with Exponential Moving Average Loss Weighting Strategies

Lakkapragada, Anish, Sleiman, Essam, Surabhi, Saimourya, Wall, Dennis P.

arXiv.org Artificial Intelligence | Nov-22-2022

Multi-Task Learning (MTL) is a growing subject of interest in deep learning, due to its ability to train models more efficiently on multiple tasks compared to using a group of conventional single-task models. However, MTL can be impractical as certain tasks can dominate training and hurt performance in others, thus making some tasks perform better in a single-task model compared to a multi-task one. Such problems are broadly classified as negative transfer, and many approaches have been proposed in the literature to mitigate these issues. One current approach to alleviate negative transfer is to weight each of the losses so that they are on the same scale. Current loss-balancing approaches rely on either optimization or complex numerical analysis; none directly scales the losses based on their observed magnitudes. We propose multiple techniques for loss balancing based on scaling by the exponential moving average and benchmark them against current best-performing methods on three established datasets. On these datasets, they achieve comparable, if not higher, performance compared to current best-performing methods.
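
The abstract above does not give a formula, so the following is only a minimal sketch of one way to scale losses by an exponential moving average (EMA) of their observed magnitudes. The class name `EmaLossScaler`, the smoothing factor `beta`, and the choice to divide each loss by its own EMA are assumptions made for illustration, not the authors' published method.

```python
class EmaLossScaler:
    """Hypothetical helper: divides each task loss by an EMA of its magnitude.

    Assumption: scaling is loss_i / EMA_i, which brings losses of very
    different magnitudes onto a comparable scale (values near 1).
    """

    def __init__(self, num_tasks, beta=0.9, eps=1e-8):
        self.beta = beta              # smoothing factor (assumed value)
        self.eps = eps                # guards against division by zero
        self.ema = [None] * num_tasks

    def scale(self, losses):
        scaled = []
        for i, loss in enumerate(losses):
            magnitude = abs(loss)
            if self.ema[i] is None:
                # First observation initialises the running average.
                self.ema[i] = magnitude
            else:
                self.ema[i] = self.beta * self.ema[i] + (1 - self.beta) * magnitude
            scaled.append(loss / (self.ema[i] + self.eps))
        return scaled


# Two tasks whose raw losses differ by four orders of magnitude.
demo_scaler = EmaLossScaler(num_tasks=2)
for step_losses in ([100.0, 0.01], [50.0, 0.02], [25.0, 0.015]):
    print(demo_scaler.scale(step_losses))
```

In a real training loop the EMA would normally be treated as a constant when computing gradients, so that only the raw loss is differentiated; that detail is omitted here because the sketch works on plain floats.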

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

arXiv:2211.12999

Country:

North America > United States > California > Yolo County > Davis (0.14)
North America > United States > California > Santa Clara County > Stanford (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

FedGradNorm: Personalized Federated Gradient-Normalized Multi-Task Learning

Mortaheb, Matin, Vahapoglu, Cemil, Ulukus, Sennur

arXiv.org Machine Learning | Mar-24-2022

Multi-task learning (MTL) is a novel framework to learn several tasks simultaneously with a single shared network where each task has its distinct personalized header network for fine-tuning. MTL can be implemented in federated learning settings as well, in which tasks are distributed across clients. In federated settings, the statistical heterogeneity due to different task complexities and data heterogeneity due to the non-iid nature of local datasets can both degrade the learning performance of the system. In addition, tasks can negatively affect each other's learning performance due to negative transference effects. To cope with these challenges, we propose FedGradNorm, which uses a dynamic-weighting method to normalize gradient norms in order to balance learning speeds among different tasks. FedGradNorm improves the overall learning performance in a personalized federated learning setting. We provide convergence analysis for FedGradNorm by showing that it has an exponential convergence rate. We also conduct experiments on the multi-task facial landmark (MTFL) and wireless communication system (RadComDynamic) datasets. The experimental results show that our framework can achieve faster training than an equal-weighting strategy. In addition to improving training speed, FedGradNorm also compensates for the imbalanced datasets among clients.

artificial intelligence, fedgradnorm, machine learning, (15 more...)

arXiv.org Machine Learning

arXiv:2203.13663

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

S-multi-SNE: Semi-Supervised Classification and Visualisation of Multi-View Data

Rodosthenous, Theodoulos, Shahrezaei, Vahid, Evangelou, Marina

arXiv.org Machine Learning | Nov-5-2021

An increasing amount of multi-view data is being published by studies in several fields. This type of data corresponds to multiple data-views, each representing a different aspect of the same set of samples. We have recently proposed multi-SNE, an extension of t-SNE, that produces a single visualisation of multi-view data. The multi-SNE approach provides low-dimensional embeddings of the samples, which are updated iteratively through the different data-views. Here, we further extend multi-SNE to a semi-supervised approach that classifies unlabelled samples by regarding the labelling information as an extra data-view. We look deeper into the performance, limitations and strengths of multi-SNE and its extension, S-multi-SNE, by applying the two methods to various multi-view datasets with different challenges. We show that by including the labelling information, the projection of the samples improves drastically and is accompanied by a strong classification performance.

algorithm, dataset, s-multi-sne, (16 more...)

arXiv.org Machine Learning

arXiv:2111.03519

Country:

Europe > United Kingdom (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Network Transfer Learning via Adversarial Domain Adaptation with Graph Convolution

Dai, Quanyu, Shen, Xiao, Wu, Xiao-Ming, Wang, Dan

arXiv.org Machine Learning | Sep-3-2019

This paper studies the problem of cross-network node classification to overcome the insufficiency of labeled data in a single network. It aims to leverage the label information in a partially labeled source network to assist node classification in a completely unlabeled or partially labeled target network. Existing methods for single network learning cannot solve this problem due to the domain shift across networks. Some multi-network learning methods rely heavily on the existence of cross-network connections and are thus inapplicable to this problem. To tackle this problem, we propose a novel network transfer learning framework, AdaGCN, by leveraging the techniques of adversarial domain adaptation and graph convolution. It consists of two components: a semi-supervised learning component and an adversarial domain adaptation component. The former aims to learn class discriminative node representations with given label information of the source and target networks, while the latter contributes to mitigating the distribution divergence between the source and target domains to facilitate knowledge transfer. Extensive empirical evaluations on real-world datasets show that AdaGCN can successfully transfer class information with a low label rate on the source network and a substantial divergence between the source and target domains. Codes will be released upon acceptance.

Node classification is an important building block of numerous real-world applications, such as product recommendation in e-commerce websites, advertisement distribution in social networks, and protein function identification for disease diagnosis. Many research efforts have been made to develop reliable and efficient methods for node classification in networked data. In the era of big data, a massive amount of raw data in information networks is produced every day. However, labeled data is significantly expensive and slow to acquire due to the high cost and long time of human annotations, making it difficult to train a well-generalized classifier [2]. Moreover, in some newly-formed networks such as a protein-protein interaction network constructed by some researchers, there may be no labels at all. Hence, it would be impossible to classify the nodes with only the information of this network. To tackle these issues, a promising approach is to utilize class information from other similar or related networks to assist in classification, i.e., transfer learning on networked data [3], [4].

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Machine Learning

arXiv:1909.01541

Country: Asia > China > Hong Kong (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Information Technology > Services > e-Commerce Services (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

5 algorithms to train a neural network

#artificialintelligence | Apr-1-2018, 21:09:54 GMT

The procedure used to carry out the learning process in a neural network is called the training algorithm. There are many different training algorithms, with different characteristics and performance. The learning problem in neural networks is formulated in terms of the minimization of a loss function, f. This function is, in general, composed of an error term and a regularization term. The error term evaluates how well a neural network fits the data set. The regularization term, on the other hand, is used to prevent overfitting by controlling the effective complexity of the neural network.
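
The loss decomposition described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the article's own code: the model is a one-parameter-pair linear fit instead of a neural network, the error term is a mean squared error, the regularization term is an L2 penalty on the weight, and the training algorithm is plain gradient descent with a fixed step size, named `training_rate` here on the assumption that this is what the page's "training rate" tag refers to. All function and parameter names are hypothetical.

```python
def loss(w, b, xs, ys, reg=0.01):
    """Loss f = error term + regularization term."""
    n = len(xs)
    error = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n  # fit to data
    regularization = reg * w * w                                   # complexity penalty
    return error + regularization


def gradient(w, b, xs, ys, reg=0.01):
    """Partial derivatives of the loss with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n + 2 * reg * w
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def train(xs, ys, training_rate=0.05, steps=500, reg=0.01):
    """Gradient descent: step against the gradient, scaled by the training rate."""
    w, b = 0.0, 0.0
    history = [loss(w, b, xs, ys, reg)]
    for _ in range(steps):
        dw, db = gradient(w, b, xs, ys, reg)
        w -= training_rate * dw
        b -= training_rate * db
        history.append(loss(w, b, xs, ys, reg))
    return w, b, history


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b, history = train(xs, ys)
print(w, b, history[0], history[-1])
```

Because of the regularization term, the fitted weight ends up slightly below the value that would minimize the error term alone; raising `reg` strengthens that pull toward a simpler model.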

artificial intelligence, machine learning, neural network, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
